POSTECH at NTCIR-5
Abstract
This paper describes our methodologies for NTCIR-5 CLIR involving Korean and Japanese, and reports the official result as well as retrieval results on NTCIR-3 and NTCIR-4 data. We participated in four tasks: the K-K and J-J monolingual tracks and the K-J and J-K cross-lingual tracks. Unlike in English, term extraction in Asian languages such as Korean and Japanese is nontrivial because of segmentation ambiguities. To address this, we prepared multiple term representations for documents and queries, and merged the ranked results obtained from each representation to generate the final ranking. In preliminary experiments using NTCIR-3 and NTCIR-4 data, our model showed the best performance for description queries in Korean and Japanese. In offline results using NTCIR-5 data, our methodology for Korean showed the best performance, achieving 0.5680 for description queries and 0.6159 for others.
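The abstract does not say how the ranked results from the different term representations are merged. As a rough illustration only, the sketch below assumes a weighted CombSUM-style fusion with min-max score normalization, which is a common generic choice and not necessarily the authors' method. The function names, run names, document IDs, and scores are all hypothetical.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so every document gets the same weight.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def merge_rankings(runs, weights=None):
    """Fuse several retrieval runs into one ranking (assumed CombSUM variant).

    runs    -- list of {doc_id: score} mappings, one per term representation
    weights -- optional per-run weights; defaults to equal weighting
    Returns a list of (doc_id, fused_score) sorted by descending score,
    with ties broken by doc_id so the output is deterministic.
    """
    if weights is None:
        weights = [1.0] * len(runs)
    if len(weights) != len(runs):
        raise ValueError("need exactly one weight per run")
    fused = defaultdict(float)
    for run, weight in zip(runs, weights):
        for doc, score in min_max_normalize(run).items():
            fused[doc] += weight * score
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores from two term representations of the same query,
# e.g. one word-based index and one character n-gram index.
word_run = {"d1": 2.0, "d2": 1.0, "d3": 0.0}
ngram_run = {"d2": 3.0, "d3": 2.0, "d4": 1.0}

ranking = merge_rankings([word_run, ngram_run])
for doc, score in ranking:
    print(doc, round(score, 3))
```

In this toy input, a document retrieved under both representations (d2) accumulates evidence from each run and rises above a document that scores highly in only one run (d1). That is the general intuition behind combining multiple term extractions, under the assumptions stated above.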
Similar resources
POSTECH at NTCIR-5 Patent Retrieval: Smoothing Experiments in a Language Modeling Approach to Patent Retrieval
This report describes the experimental results of our participation in the Document Retrieval Subtask of the NTCIR-5 Patent Retrieval Task. Unlike newspaper articles, which were the main document type handled in previous information retrieval experiments, patent documents have many different characteristics in terms of length, technicality, structure, etc. Among these, we focus on the lengt...
POSTECH at NTCIR-6 English Patent Retrieval Subtask
This paper reports our experimental results at the NTCIR-6 English Patent Retrieval Subtask. Our previous participation in the patent retrieval subtask revealed that the long length of patent applications requires less smoothing of the document model than general documents such as newspaper articles. We set up the initial baseline retrieval system for U.S. patent applications and compare the...
The POSTECH Statistical Machine Translation Systems for NTCIR-7 Patent Translation Task
This paper describes the POSTECH statistical machine translation (SMT) systems for the NTCIR-7 patent translation task. We entered two patent translation subtasks: Japanese-to-English (KLE-je) and English-to-Japanese (KLE-ej) translation. The baseline systems are derived from a common phrase-based SMT framework. In addition, for Japanese-to-English translation, we adopted two kinds of methods. ...
POSTECH Question-Answering Experiments at NTCIR4-QAC
This paper describes our system and additional experimental results in NTCIR-4 QAC Task 1. The main components of our system are question classification, passage retrieval, and named entity extraction. Passage retrieval was performed by a density-based ranking method based on the importance of query terms occurring in the passage. Question classification and named entity extraction were designed by ...
POSTECH at NTCIR-6: Combining Evidences of Multiple Term Extractions for Mono-lingual and Cross-lingual Retrieval in Korean and Japanese
This paper describes our methodologies for NTCIR-6 CLIR involving Korean and Japanese, and reports the official result for Stage 1 and Stage 2. We participated in three tracks: the K-K and J-J monolingual tracks and the J-K cross-lingual track. As in the previous year, we focus on handling segmentation ambiguities in Asian languages. As a result, we prepared multiple term representations for documents...
Publication date: 2005